Recently, machine learning algorithms have been widely used in the field of electroencephalography (EEG) brain-computer interfaces (BCI). The preprocessing stage for the EEG signals is performed by applying the principal component analysis (PCA) algorithm to extract the important features and reduce data redundancy. A model for classifying EEG time series signals for facial expressions and some motor execution processes was designed. A neural network of three hidden layers with a deep learning classifier was used in this work. Data from four different subjects were collected using a 14-channel Emotiv EPOC+ device. EEG dataset samples covering ten action classes for facial expressions and some motor execution movements were recorded. Classification accuracy in the range 91.25-95.75% was obtained for the collected samples with respect to: the number of samples for each class, the total number of EEG dataset samples, and the type of activation function within the hidden and output layer neurons. The time series EEG signal was taken as signal values, not as an image or histogram, and was analysed and classified with deep learning to obtain satisfactory accuracy.

Keywords: BCI, Deep learning, EEG, Neural network, PCA
1. INTRODUCTION

It is well known that a system which connects human brain signals to appliances or devices without requiring any physical contact is called a brain-computer interface (BCI). It is seen as a new means of communication in which brain activity, reflected in the electrical signals of the brain, is used to control external systems such as computers, wheelchairs, switches, or neuroprosthetic extensions [1]-[6].

Electroencephalography (EEG) is the process of acquiring and recording the electrical signals of the brain so that brain activity can be analyzed and made visible to the user. Electrodes are placed on the scalp, in a simple manner, to collect the brain's electrical signals. An EEG signal is band-limited in frequency (0.1-60 Hz). EEG signals are modeled and classified into five types (delta, theta, alpha, beta, and gamma waves), each of which is associated with different brain activities [7], [8]. EEG signals contain high redundancy in the collected data, so feature extraction is an important stage before those signals are classified. A feature is a distinctive attribute, identifiable measure, or functional element obtained from a segment of samples. Feature extraction is used to preserve the significant information in the signal and minimize its loss as much as possible, as well as to reduce the resources needed to describe a huge amount of data accurately. This leads to a simple implementation that reduces the cost of processing the information and eliminates the need for data compression [9]-[14]. In this work, the principal component analysis (PCA) method is used for unsupervised feature extraction. This method is a descriptive statistical technique which describes the differences between the samples of the dataset and the most correlated samples. PCA detects the principal components of the signal dataset and thereby reduces the dimension of the data [15].

Algorithms for classifying EEG-based BCIs are divided into four main classes: matrix and tensor, adaptive, deep learning, and transfer learning classifiers, as well as a few other diverse classifiers [2], [12], [16]-[20]. In EEG research, machine learning has been used to discover relevant information for neuroimaging and neural classification. Advances in machine learning and the availability of huge EEG datasets have led to the deployment of deep learning in analyzing EEG signals and in understanding brain functionality by characterizing the information collected from it [6], [21]-[24]. In general, the uses of deep learning in EEG applications fall into the following groups: motor imagery, emotion recognition, mental task workload, seizure observation, event-related potential (ERP) task detection, and sleep state recording [25].


2. RESEARCH METHOD 

The work in this paper focuses on EEG signal features to identify the EEG signals for facial expressions (FEs) and some motor execution actions. The FEs include: surprise, smile, left wink, right wink, and mouth opened. The motor execution actions include: right hand lifting, left hand lifting, right rotation of the head, left rotation of the head, and clapping. All these signals were first collected with the Emotiv EPOC+ 14-channel mobile brainwear headset and fetched by the licensed Emotiv Pro software for use in a Python environment. A model for classifying those signals was then designed. Figure 1 shows the block diagram of the research methodology. The details of each step are explained in the following subsections.
Figure 1. Research methodology block diagram





2.1. Data collection 

The first stage of the research methodology is the collection of dataset samples using the Emotiv EPOC+ headset, whose 14 channels are distributed around the head. The data were collected from four subjects, males and females of different ages (10-50 years), while they performed the required facial expressions and motor execution actions. The EEG signals were recorded with the monthly licensed Emotiv software (Emotiv Pro) and saved as .csv files to be used later for training the neural network in the Python environment. During the recording process, a total of 6487 EEG samples were collected. Table 1 shows some samples of the collected EEG data for lifting the left hand for one subject.
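As an illustration of how such a recording might be read back in Python, the following is a minimal sketch. It assumes the .csv file has already been reduced to the 14 channel columns named as in Table 1; the function `load_eeg_csv` and the integer label are hypothetical, and the two demo rows are copied from Table 1.

```python
import csv
import io

import numpy as np

# The 14 Emotiv EPOC+ channel columns, in the order used in Table 1.
CHANNELS = ["EEG.AF3", "EEG.F7", "EEG.F3", "EEG.FC5", "EEG.T7", "EEG.P7", "EEG.O1",
            "EEG.O2", "EEG.P8", "EEG.T8", "EEG.FC6", "EEG.F4", "EEG.F8", "EEG.AF4"]

LEFT_HAND_LIFT = 6  # hypothetical integer code for one of the ten classes


def load_eeg_csv(handle, label):
    """Read the 14 channel columns from an open .csv file and attach one class label."""
    reader = csv.DictReader(handle)
    rows = [[float(row[name]) for name in CHANNELS] for row in reader]
    samples = np.asarray(rows, dtype=float)        # shape: (n_samples, 14)
    labels = np.full(len(rows), label, dtype=int)  # the same label for every row
    return samples, labels


# Stand-in for a recording file: the first two rows of Table 1.
demo_file = io.StringIO(
    ",".join(CHANNELS) + "\n"
    + "4627.308,6175.769,7376.923,7018.461,7112.82,3223.461,3223.846,"
    + "1274.103,3862.051,629.7436,8023.974,7521.795,8334.743,3344.359\n"
    + "7502.436,5615.513,6324.231,6095.513,1287.821,7299.872,7261.41,"
    + "6439.615,1462.692,7645.513,3923.974,3256.539,107.6923,6799.615\n"
)
samples, labels = load_eeg_csv(demo_file, LEFT_HAND_LIFT)
```

A real Emotiv Pro export may contain additional metadata columns, so the column selection by name is the part most likely to need adjusting.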


2.2. Data pre-processing 

This stage covers artifact removal from the EEG signals, which is done by the Emotiv headset itself; the data are recorded directly as received from the headset. The headset applies a good amount of signal processing and filtering to remove artifacts and harmonic frequencies, so the signals appear clean when good contact quality is achieved. The signals are sampled at a 2048 Hz sampling frequency and then passed through a dual notch filter at 50 Hz and 60 Hz as well as a low-pass filter with a 64 Hz cutoff frequency. Finally, the data are downsampled to 128 or 256 Hz.
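The headset performs this chain internally, so no filtering code is part of this work. Purely to illustrate the order of operations, a comparable offline chain could be sketched with SciPy as follows. The filter order, the notch quality factor, and the function name `headset_like_chain` are assumptions, and the input is a synthetic test signal rather than recorded EEG.

```python
import numpy as np
from scipy import signal

FS_IN = 2048   # internal sampling rate stated in the text (Hz)
FS_OUT = 128   # one of the two output rates stated in the text (Hz)


def headset_like_chain(x, fs_in=FS_IN, fs_out=FS_OUT, notch_q=30.0, lp_order=4):
    """Notch at 50 and 60 Hz, low-pass at 64 Hz, then downsample to fs_out."""
    for f0 in (50.0, 60.0):
        b, a = signal.iirnotch(f0, notch_q, fs=fs_in)
        x = signal.filtfilt(b, a, x)               # zero-phase notch
    b, a = signal.butter(lp_order, 64.0, btype="low", fs=fs_in)
    x = signal.filtfilt(b, a, x)                   # zero-phase low-pass
    down = fs_in // fs_out                         # 16 for 128 Hz, 8 for 256 Hz
    return signal.resample_poly(x, 1, down)


# Synthetic input: a 10 Hz component plus 50 Hz mains interference, 8 seconds long.
t = np.arange(8 * FS_IN) / FS_IN
raw = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
clean = headset_like_chain(raw)

# Magnitude spectrum at the output rate; bin spacing is FS_OUT / len(clean) = 0.125 Hz.
spectrum = np.abs(np.fft.rfft(clean))
```

In this sketch the 10 Hz component passes through while the 50 Hz component is strongly attenuated by the notch.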


2.3. Feature extraction 
In this stage, the preprocessed data obtained from the Emotiv headset are processed with the PCA algorithm to improve the classifier's accuracy. PCA is a technique for reducing the dimensionality of large datasets. This is achieved by converting a large set of variables into a smaller one that still contains most of the information in the large set [15], [26]. We have 6487 samples from each of the 14 channels of the headset. To implement PCA, the mean values must be computed first so that the standardization (Z) of the initial values of the dataset can be computed, as in (1), to transform all the variables to the same range [26].

$$ z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} \qquad (1) $$
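The standardization in (1) can be sketched per channel with NumPy. This is a minimal illustration, assuming the population standard deviation (`ddof=0`), which the text does not specify; the function name `standardize` is hypothetical, and the demo values are the first three channels of the first three rows of Table 1.

```python
import numpy as np


def standardize(data):
    """Apply (1) column by column: subtract the channel mean, divide by its standard deviation."""
    mean = data.mean(axis=0)
    std = data.std(axis=0)  # assumption: population standard deviation (ddof=0)
    return (data - mean) / std


# First three channels (AF3, F7, F3) of the first three rows of Table 1.
demo = np.array([[4627.308, 6175.769, 7376.923],
                 [7502.436, 5615.513, 6324.231],
                 [5662.18, 5518.461, 4492.308]])
z = standardize(demo)
```

After this step every column has zero mean and unit standard deviation, so no channel dominates the covariance computation only because of its scale.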


Table 1. EEG dataset sample examples for left hand lifting from the Emotiv headset

| EEG.AF3 | EEG.F7 | EEG.F3 | EEG.FC5 | EEG.T7 | EEG.P7 | EEG.O1 | EEG.O2 | EEG.P8 | EEG.T8 | EEG.FC6 | EEG.F4 | EEG.F8 | EEG.AF4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4627.308 | 6175.769 | 7376.923 | 7018.461 | 7112.82 | 3223.461 | 3223.846 | 1274.103 | 3862.051 | 629.7436 | 8023.974 | 7521.795 | 8334.743 | 3344.359 |
| 7502.436 | 5615.513 | 6324.231 | 6095.513 | 1287.821 | 7299.872 | 7261.41 | 6439.615 | 1462.692 | 7645.513 | 3923.974 | 3256.539 | 107.6923 | 6799.615 |
| 5662.18 | 5518.461 | 4492.308 | 3957.436 | 5409.359 | 4291.667 | 715.6411 | 4300.897 | 3987.821 | 6205.385 | 7224.103 | 7947.18 | 3203.974 | 4669.103 |
| 5591.566 | 5528.655 | 4493.29 | 4027.343 | 5410.139 | 4219.964 | 723.8898 | 4265.287 | 3929.946 | 6159.411 | 7237.689 | 7829.861 | 3138.312 | 4696.339 |
| 5520.952 | 5538.849 | 4494.271 | 4097.25 | 5410.919 | 4148.262 | 732.1386 | 4229.675 | 3872.07 | 6113.437 | 7251.276 | 7712.542 | 3072.649 | 4723.576 |
| 5450.338 | 5549.042 | 4495.254 | 4167.158 | 5411.699 | 4076.56 | 740.3873 | 4194.064 | 3814.195 | 6067.463 | 7264.864 | 7595.224 | 3006.986 | 4750.813 |
| 5379.725 | 5559.236 | 4496.236 | 4237.065 | 5412.479 | 4004.858 | 748.6361 | 4158.453 | 3756.32 | 6021.489 | 7278.451 | 7477.905 | 2941.323 | 4778.05 |
| 5309.111 | 5569.43 | 4497.218 | 4306.972 | 5413.26 | 3933.156 | 756.8849 | 4122.842 | 3698.445 | 5975.516 | 7292.038 | 7360.586 | 2875.66 | 4805.287 |
| 5238.497 | 5579.624 | 4498.2 | 4376.879 | 5414.04 | 3861.454 | 765.1337 | 4087.231 | 3640.57 | 5929.542 | 7305.625 | 7243.268 | 2809.997 | 4832.523 |
| 5167.883 | 5589.817 | 4499.182 | 4446.787 | 5414.82 | 3789.752 | 773.3824 | 4051.62 | 3582.695 | 5883.568 | 7319.212 | 7125.949 | 2744.334 | 4859.76 |
| 5097.27 | 5600.011 | 4500.164 | 4516.694 | 5415.6 | 3718.05 | 781.6312 | 4016.009 | 3524.82 | 5837.594 | 7332.799 | 7008.631 | 2678.672 | 4886.997 |
| 5026.656 | 5610.205 | 4501.146 | 4586.601 | 5416.38 | 3646.347 | 789.88 | 3980.398 | 3466.945 | 5791.62 | 7346.386 | 6891.312 | 2613.009 | 4914.233 |
| 4956.042 | 5620.398 | 4502.127 | 4656.508 | 5417.16 | 3574.646 | 798.1287 | 3944.787 | 3409.07 | 5745.646 | 7359.973 | 6773.994 | 2547.346 | 4941.47 |
| 4885.428 | 5630.592 | 4503.11 | 4726.416 | 5417.94 | 3502.943 | 806.3775 | 3909.176 | 3351.195 | 5699.673 | 7373.56 | 6656.675 | 2481.683 | 4968.707 |
| 4814.814 | 5640.786 | 4504.092 | 4796.323 | 5418.721 | 3431.241 | 814.6263 | 3873.565 | 3293.32 | 5653.699 | 7387.147 | 6539.356 | 2416.02 | 4995.944 |





The second step of PCA is to compute the covariance matrix, to check whether there is any relationship or correlation between the variables of the dataset, so that information redundancy can be reduced as much as possible. The covariance between all possible pairs of the initial dataset variables is computed using (2) in order to construct the entries of the covariance matrix, which is a p×p symmetric matrix.


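$$ \operatorname{cov}[X,Y] = \frac{\sum_{i=1}^{p} X_i Y_i - p\,\bar{X}\,\bar{Y}}{p-1} \qquad (2) $$

where:
$\bar{X}$ is the mean value of variable X (likewise $\bar{Y}$ for variable Y)
p is the number of dimensions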
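To make the covariance step concrete, the sketch below builds the symmetric covariance matrix pair by pair and compares it with NumPy's built-in routine. It assumes the sum-of-products form of the covariance with the sample normalisation (division by the number of terms minus one) and takes p inside the sum to be the number of samples; both are assumptions about the notation. The function names and the random input are illustrative only.

```python
import numpy as np


def covariance(x, y):
    """Sum-of-products form: (sum(x_i * y_i) - p * mean(x) * mean(y)) / (p - 1)."""
    p = len(x)  # assumption: number of terms in the sum
    return (np.sum(x * y) - p * x.mean() * y.mean()) / (p - 1)


def covariance_matrix(z):
    """Fill the symmetric matrix of covariances between every pair of columns of z."""
    n_vars = z.shape[1]
    cov = np.empty((n_vars, n_vars))
    for i in range(n_vars):
        for j in range(n_vars):
            cov[i, j] = covariance(z[:, i], z[:, j])
    return cov


# Synthetic stand-in for standardized EEG data: 50 samples, 4 variables.
rng = np.random.default_rng(0)
z = rng.normal(size=(50, 4))
cov = covariance_matrix(z)
```

Under these assumptions the result agrees with `np.cov(z, rowvar=False)`, which is the more practical call for real data.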

The third step of PCA is to compute the eigenvectors and eigenvalues of the covariance matrix in order to locate the principal components. The principal components are new uncorrelated variables, constructed so that most of the information in the dataset is compressed into the first components, with the information content gradually decreasing in later ones. The fourth step is to form the feature vector, a matrix whose columns are the eigenvectors of the components retained from the previous step. This keeps only k components (eigenvectors) instead of all p of them. The final step of PCA is to recast the data from the original axes onto the axes of the selected principal components by multiplying the transpose of the feature vector by the transpose of the standardized dataset, as in (3):


$$ \text{FinalDataset} = \text{FeatureVector}^{T} \times Z^{T} \qquad (3) $$
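The last three steps (eigendecomposition, selection of k eigenvectors, and the projection in (3)) can be sketched in NumPy as follows. This is an illustrative outline rather than the code used in this work: the function name `pca_project`, the value k = 5, and the synthetic input are assumptions, since the number of retained components is not stated.

```python
import numpy as np


def pca_project(z, k):
    """Project standardized data z (n_samples x p) onto its first k principal components."""
    cov = np.cov(z, rowvar=False)             # p x p covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh suits symmetric matrices; ascending order
    order = np.argsort(eigvals)[::-1]         # re-order so the largest variance comes first
    feature_vector = eigvecs[:, order[:k]]    # p x k matrix of retained eigenvectors
    final = feature_vector.T @ z.T            # equation (3): k x n_samples
    return final.T, eigvals[order]


# Synthetic stand-in for the 14-channel data: correlated columns, then standardized.
rng = np.random.default_rng(0)
raw = rng.normal(size=(200, 14)) @ rng.normal(size=(14, 14))
z = (raw - raw.mean(axis=0)) / raw.std(axis=0)

reduced, eigenvalues = pca_project(z, k=5)    # k = 5 is a hypothetical choice
```

The projected columns are uncorrelated, and their variances equal the k largest eigenvalues, which is a convenient check that the projection was built correctly.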
2.4. Classification model development

In this work, a deep neural network was built to classify the EEG signals for the ten actions, comprising facial expressions and motor execution. The main advantage of applying deep learning is that its performance often continues to improve as the size of the dataset increases. The model was implemented in the Spyder 3.3.1 Python environment using Keras, a deep learning API written in Python. A Sequential model, which is a linear stack of layers, was built with 3 hidden layers containing 1024, 512, and 256 neurons respectively, each with the tanh(x) activation function. The output layer consists of 10 neurons with the softmax(x) activation function. Figure 2 shows the sequential model of this work.


[Figure 2 diagram: Sequential model. Input: EEG samples from 14 channels; 3 hidden layers (output 256); output layer (input 256, output 10); output: 10 classes]

Figure 2. Sequential model representation 
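The architecture described in this subsection could be expressed in Keras roughly as follows. This is a sketch under stated assumptions, not the original training script: the input width of 14 (one value per channel), the loss function, and the helper name `build_model` are assumptions, while the layer sizes, the activations, and the RMS-type optimizer follow the text.

```python
from tensorflow import keras
from tensorflow.keras import layers

N_FEATURES = 14  # assumption: one input value per EEG channel (or per PCA component)
N_CLASSES = 10   # five facial expressions and five motor execution actions


def build_model(n_features=N_FEATURES, n_classes=N_CLASSES):
    """Sequential stack: three tanh hidden layers followed by a softmax output layer."""
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(1024, activation="tanh"),
        layers.Dense(512, activation="tanh"),
        layers.Dense(256, activation="tanh"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="rmsprop",                     # RMS-type optimizer, as named in the text
        loss="sparse_categorical_crossentropy",  # assumption: integer class labels
        metrics=["accuracy"],
    )
    return model


model = build_model()
```

Training would then be a call such as `model.fit(x_train, y_train, epochs=100)` with hypothetical arrays `x_train` and `y_train`; 100 and 150 are the epoch counts compared in this work.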


2.5. Performance evaluation

The collected dataset samples are divided into two groups, an 80% training dataset and a 20% testing dataset, to build and test the sequential classification model. The performance is evaluated in each epoch with respect to two parameters: the loss value and the classification accuracy. Accuracy is the percentage of predicted values (yPred) that match the actual values (yTrue). When running the model, the effect of several parameters must be observed, since they significantly affect the accuracy and the processing time of the classification process. These parameters are: the number of samples for each class, the total number of samples, and the type of activation function applied in the hidden and output layer neurons. Using an equal number of samples for each class gives better classification accuracy than using a random number of samples per class, and it clearly reduces the number of epochs required to train the neural network; hence the overall processing time is reduced, as shown in Figure 3.
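The evaluation setup above (equal samples per class, an 80%/20% split, and accuracy as the share of matching predictions) can be sketched with NumPy. The helper names and the synthetic data are hypothetical and only illustrate the bookkeeping; they are not the evaluation code of this work.

```python
import numpy as np


def balanced_subsample(x, y, per_class, rng):
    """Keep the same number of randomly chosen samples for every class."""
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        keep.append(rng.choice(idx, size=per_class, replace=False))
    keep = np.concatenate(keep)
    return x[keep], y[keep]


def split_80_20(x, y, rng):
    """Shuffle the samples, then use the first 80% for training and the rest for testing."""
    order = rng.permutation(len(y))
    cut = int(0.8 * len(y))
    train, test = order[:cut], order[cut:]
    return x[train], y[train], x[test], y[test]


def accuracy(y_true, y_pred):
    """Fraction of predicted labels (yPred) that match the actual labels (yTrue)."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


# Synthetic stand-in: 2000 samples, 14 features, 10 classes with uneven counts.
rng = np.random.default_rng(0)
x_all = rng.normal(size=(2000, 14))
y_all = rng.integers(0, 10, size=2000)

x_bal, y_bal = balanced_subsample(x_all, y_all, per_class=100, rng=rng)
x_train, y_train, x_test, y_test = split_80_20(x_bal, y_bal, rng)
score = accuracy([0, 1, 2, 2], [0, 1, 1, 2])  # three of four labels match
```

With 100 samples per class the balanced set has 1000 samples, which the split divides into 800 for training and 200 for testing.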

The total number of samples is the size of the collected dataset. As this size increases, deep learning gives better classification results, but the increase cannot be continued indefinitely, since the processing time also increases while the accuracy settles at a specific value. Finally, there are many types of activation functions, such as sigmoid, relu, softmax, tanh, and exponential. After trying these types in the hidden layer neurons, the most acceptable accuracy level was obtained with the tanh(x) activation function, while softmax(x) was used in the output layer neurons. The root mean square propagation (RMSprop) optimizer was used to minimize the error while training the neural network.
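For reference, the two activation functions retained in the final model have the standard definitions below, where $x_i$ denotes the input to output neuron $i$ and $K = 10$ is the number of classes (the symbols $x_i$ and $K$ are introduced here only for illustration):

```latex
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad
\operatorname{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{K} e^{x_j}}, \quad i = 1, \dots, K
```

The tanh output lies in (-1, 1), while the softmax outputs are positive and sum to 1, so they can be read as class probabilities for the ten actions.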





Classification of EEG signals for facial expression and motor execution with .... (Areej Hameed Al-Anbary) 


159 O ISSN: 1693-6930 





| Total no. of samples | Accuracy % (epoch 150) | Accuracy % (epoch 100) | No. of samples/class |
|---|---|---|---|
| 6487 | 93.04% | 91.75% | random no. of samples/class |
| 5943 | 91.94% | 92.43% | random no. of samples/class |
| 4168 | 93.01% | 94.24% | random no. of samples/class |
| 3173 | 88.73% | 93.14% | random no. of samples/class |
| 2270 | 93.77% | 93.72% | random no. of samples/class |
| 2001 | 93.56% | 93.00% | 200 samples/class |
| 973 | 94.47% | 95.75% | 100 samples/class |
| 751 | 93.33% | 93.00% | 75 samples/class |
| 501 | 95.25% | 95.00% | 50 samples/class |
| 276 | 92.73% | 93.64% | 25 samples/class |
| 101 | 92.50% | [illegible] | 10 samples/class |

[Figure 3 chart: accuracy (84.00%-98.00%) versus total number of samples (6487 to 101), with one series for epoch 150 and one for epoch 100]
Figure 3. Classification accuracy levels


3. RESULTS AND DISCUSSION

First, EEG signal classification into 10 classes of facial expressions and motor execution actions was implemented for four subjects. The performance of the classification model was evaluated as described in the previous section. The training accuracy ranges from 91.25% to 95.75%, and the best result was obtained when training the model with 100 samples/class, for a total of 973 samples. In future work, these results will be used in applications such as binding those classes to specific sentences or words to help speechless persons express their thoughts. The main goal of this paper is therefore to design a simple EEG classifier that can be used to give speechless persons the ability to express their intended thoughts.


4. CONCLUSIONS 

In this paper, ten classes of EEG time series signal values were classified by building a deep neural network and applying deep learning techniques. A dedicated dataset was recorded. In offline training, the classification accuracy reached 95.75% while minimizing the computation cost and storage requirements, by applying only the PCA algorithm to the EEG dataset signal values without any other filtering and by feeding the deep neural network with EEG signal values rather than images or histograms.